22 research outputs found
Fast and Interpretable Nonlocal Neural Networks for Image Denoising via Group-Sparse Convolutional Dictionary Learning
Nonlocal self-similarity within natural images has become an increasingly
popular prior in deep-learning models. Despite their successful image
restoration performance, such models remain largely uninterpretable due to
their black-box construction. Our previous studies have shown that
interpretable construction of a fully convolutional denoiser (CDLNet), with
performance on par with state-of-the-art black-box counterparts, is achievable
by unrolling a dictionary learning algorithm. In this manuscript, we seek an
interpretable construction of a convolutional network with a nonlocal
self-similarity prior that performs on par with black-box nonlocal models. We
show that such an architecture can be effectively achieved by upgrading the
sparsity prior of CDLNet to a weighted group-sparsity prior. From this
formulation, we propose a novel sliding-window nonlocal operation, enabled by
sparse array arithmetic. In addition to competitive performance with black-box
nonlocal DNNs, we demonstrate that the proposed sliding-window sparse attention
enables inference more than an order of magnitude faster than its
competitors. Comment: 11 pages, 8 figures, 6 tables
Lateralization in the dichotic listening of tones is influenced by the content of speech
Available online 10 February 2020. Cognitive functions, for example speech processing, are distributed asymmetrically across the two hemispheres, which mostly have homologous anatomical structures.
Dichotic listening is a well-established paradigm to investigate hemispherical lateralization of speech. However, the mixed results of dichotic listening, especially
when using tonal languages as stimuli, complicate the investigation of functional lateralization. We hypothesized that the inconsistent results in dichotic listening
are due to an interaction in processing a mixture of acoustic and linguistic attributes that are differentially processed over the two hemispheres. In this study, a
within-subject dichotic listening paradigm was designed, in which different levels of speech and linguistic information were incrementally included in different
conditions that required the same tone identification task. A left ear advantage (LEA), in contrast with the commonly found right ear advantage (REA) in dichotic
listening, was observed in the hummed tones condition, where only the slow frequency modulation of tones was included. However, when phonemic and lexical
information was added in simple vowel tone conditions, the LEA became unstable. Furthermore, ear preference became balanced when phonological and lexical-semantic
attributes were included in the consonant-vowel (CV), pseudo-word, and word conditions. Compared with the existing REA results that use complex
vowel word tones, a complete pattern emerged gradually shifting from LEA to REA. These results support the hypothesis that an acoustic analysis of suprasegmental
information of tones is preferably processed in the right hemisphere, but is influenced by phonological and lexical semantic processes residing in the left hemisphere.
The ear preference in dichotic listening depends on the levels of speech and linguistic analysis and preferentially lateralizes across the different hemispheres. That is,
the manifestation of functional lateralization depends on the integration of information across the two hemispheres. This study was supported by the National Natural Science Foundation of
China 31871131, Major Program of Science and Technology Commission
of Shanghai Municipality (STCSM) 17JC1404104, Program of
Introducing Talents of Discipline to Universities, Base B16018 to XT, and
the JRI Seed Grants for Research Collaboration from NYU-ECNU Institute
of Brain and Cognitive Science at NYU Shanghai to XT and QC, and
NIH 2R01DC05660 to David Poeppel at New York University supporting
NM and AF and F32 DC011985 to AF
Reconstructing Speech from Human Auditory Cortex
Direct brain recordings from neurosurgical patients listening to speech reveal that the acoustic speech signals can be reconstructed from neural activity in auditory cortex
iEEG-BIDS, extending the Brain Imaging Data Structure specification to human intracranial electrophysiology
The Brain Imaging Data Structure (BIDS) is a community-driven specification for organizing neuroscience data and metadata with the aim of making datasets more transparent, reusable, and reproducible. Intracranial electroencephalography (iEEG) data offer a unique combination of high spatial and temporal resolution measurements of the living human brain. To improve internal (re)use and external sharing of these unique data, we present a specification for storing and sharing iEEG data: iEEG-BIDS
The electrophysiology of language perception and production
For over a century, an abundance of research has tried to elucidate the neurobiological basis of language processing in the human cortex. Neuroimaging and lesion studies have provided great insight into what functions different brain structures subserve. While these techniques provide a high spatial resolution, they are limited in the temporal domain. Conversely, contributions from non-invasive electrophysiology have provided a high temporal resolution with a limited ability to localize cortical sources. The combined spatial and temporal dynamics of cortical processing during language perception and production remain largely unknown. This dissertation addresses this issue by employing unique neuronal population recordings from neurosurgical patients performing linguistic tasks. The studies described here elucidate the timing, magnitude and spatial extent of cortical processing during perception and production of language. The results provide evidence at the single-trial level that: 1) A rich network of independent and spatially distinct functional sub-regions of cortex subserves perception and production of language. 2) Neighboring sub-regions 4 mm apart can exhibit inverse, functionally specific responses to linguistic stimuli and self-produced speech. 3) Broca's area is not involved in the actual act of articulation but rather in speech preparation and interfacing perception and production. Taken together, these results defy century-old dogmas and suggest that language is supported by a complex network of independent sub-regions, with Broca's area acting as a mediator between perception and production rather than as the seat of articulation
Neural correlates of sign language production revealed by electrocorticography
Objective: The combined spatiotemporal dynamics underlying sign language production remain largely unknown. To investigate these dynamics compared to speech production, we used intracranial electrocorticography during a battery of language tasks. Methods: We report a unique case of direct cortical surface recordings obtained from a neurosurgical patient with intact hearing who is bilingual in English and American Sign Language. We designed a battery of cognitive tasks to capture multiple modalities of language processing and production. Results: We identified 2 spatially distinct cortical networks: ventral for speech and dorsal for sign production. Sign production recruited perirolandic, parietal, and posterior temporal regions, while speech production recruited frontal, perisylvian, and perirolandic regions. Electrical cortical stimulation confirmed this spatial segregation, identifying mouth areas for speech production and limb areas for sign production. The temporal dynamics revealed superior parietal cortex activity immediately before sign production, suggesting its role in planning and producing sign language. Conclusions: Our findings reveal a distinct network for sign language and detail the temporal propagation supporting sign production
Human Screams Occupy a Privileged Niche in the Communication Soundscape
Screaming is arguably one of the most relevant communication signals for survival in humans. Despite their practical relevance and their theoretical significance as innate [1] and virtually universal [2, 3] vocalizations, what makes screams a unique signal and how they are processed is not known. Here, we use acoustic analyses, psychophysical experiments, and neuroimaging to isolate those features that confer to screams their alarming nature, and we track their processing in the human brain. Using the modulation power spectrum (MPS [4, 5]), a recently developed, neurally informed characterization of sounds, we demonstrate that human screams cluster within a restricted portion of the acoustic space (between ∼30 and 150 Hz modulation rates) that corresponds to a well-known perceptual attribute, roughness. In contrast to the received view that roughness is irrelevant for communication [6], our data reveal that the acoustic space occupied by the rough vocal regime is segregated from other signals, including speech, a prerequisite to avoid false alarms in normal vocal communication. We show that roughness is present in natural alarm signals as well as in artificial alarms and that the presence of roughness in sounds boosts their detection in various tasks. Using fMRI, we show that acoustic roughness engages subcortical structures critical for rapidly appraising danger. Altogether, these data demonstrate that screams occupy a privileged acoustic niche that, being separated from other communication signals, ensures their biological and ultimately social efficiency